WORLD INTELLECTUAL PROPERTY ORGANIZATION 
International Bureau 




PCT 

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT) 



(51) International Patent Classification 6 : 
H04N 7/10 



A1



(11) International Publication Number: WO 00/05884 

(43) International Publication Date: 3 February 2000 (03.02.00) 



(21) International Application Number: PCT/IL99/00393 

(22) International Filing Date: 18 July 1999 (18.07.99)



(30) Priority Data: 

60/093,366 



20 July 1998 (20.07.98) 



US 



(71) Applicant (for all designated States except US): MATE
- MEDIA ACCESS TECHNOLOGIES LTD. [IL/IL];
Africa-Israel House, Hahoresh Road 4, 56547 Yehud (IL).

(72) Inventors; and 

(75) Inventors/Applicants (for US only): WILF, Itzhak [IL/IL];
Sapir Street 5, 60190 Neve Monoson (IL). KALMANSON,
Dan [IL/IL]; Kerem Hazeitim 8, 56547 Savion (IL).
GREENSPAN, Hayit [IL/IL]; Shikun Banim 136, 79695
Kfar Bilu (IL).

(74) Agent: EITAN, PEARL, LATZER & COHEN-ZEDEK; Gav 
Yam Center 2, Shenkar Street 7, 46725 Herzlia (IL). 



(81) Designated States: AE, AL, AM, AT, AU, AZ, BA, BB, BG, 
BR, BY, CA, CH, CN, CU, CZ, DE, DK, EE, ES, FI, GB, 
GD, GE, GH, GM, HR, HU, ID, IL, IN, IS, JP, KE, KG, 
KP, KR, KZ, LC, LK, LR, LS, LT, LU, LV, MD, MG, MK, 
MN, MW, MX, NO, NZ, PL, PT, RO, RU, SD, SE, SG, SI, 
SK, SL, TJ, TM, TR, TT, UA, UG, US, UZ, VN, YU, ZA, 
ZW, ARIPO patent (GH, GM, KE, LS, MW, SD, SL, SZ, 
UG, ZW), Eurasian patent (AM, AZ, BY, KG, KZ, MD, 
RU, TJ, TM), European patent (AT, BE, CH, CY, DE, DK, 
ES, FI, FR, GB, GR, IE, IT, LU, MC, NL, PT, SE), OAPI
patent (BF, BJ, CF, CG, CI, CM, GA, GN, GW, ML, MR, 
NE, SN, TD, TG). 



Published 

With international search report. 

Before the expiration of the time limit for amending the 
claims and to be republished in the event of the receipt of 
amendments.

(57) Abstract

A method of selecting, at a video receiver location, a
desired video program channel from a number of program
channels (10) transmitting video programs, by: automatically
generating, for each program channel, indexing data
(115) of at least one predetermined attribute based on the
video program content of the respective channel; specifying
at least one attribute corresponding to a desired program
content, such as a program title or the occurrence of any
event within a program; and identifying, from the indexing
data, any program channel having a match with respect to
the attribute specified. Preferably, the indexing data (115) is
generated at a remote location, is encoded and transmitted
in a separate indexing channel for all the program channels,
and is received and decoded at the receiver location, the
attribute corresponding to a desired program being specified
at the receiver location.

A METHOD OF AUTOMATIC SELECTION
OF VIDEO CHANNELS 

FIELD AND BACKGROUND OF THE INVENTION 

The present invention relates to multi-channel video/television
systems and, in particular, to a method of providing viewers with automated
selection of channels which match viewer-defined search criteria.

The number of video channels available over cable television systems
and satellite television systems is increasing rapidly. Therefore, users need
improved methods for selecting video channels that at a given time carry a
preferred program and/or content. Similar needs occur in video on demand
systems, interactive television, and certain internet-television arrangements.

For years, viewers have relied on pre-printed television program listings.
There are numerous disadvantages in using an external paper-based
information source, which is usually updated once a week.

In recent years, television-based electronic program guides (EPG)
have been developed. Program listings are displayed directly on the TV screen
and provide better access and ease of updating as compared to pre-printed
guides. Typically, the EPG is a scrolling TV program list that is transmitted over
a dedicated cable channel. Viewers can tune to the guide channel and view
information about programs being then transmitted or to be transmitted in the
near future.

Another form of dedicated cable channel contains a split screen
display of the other channels. A video combination device generates the display
such that several video channels (say 16) are displayed concurrently. When the
number of channels is greater than the capacity of a single display screen,
several displays are time-toggled to cover the entire set of channels. However,
the passive nature of this technique limits its value. Also, one cannot search by
title, genre, or channel, or view listings for programs scheduled a few days ahead.

Several prior art methods are specifically directed to channel
searching.

In some prior art methods, the search capabilities are manual and
therefore disturb the viewing habit. Also, manual techniques are very limited in
situations of hundreds of video channels.

In other prior art methods, automatic searching is based on
pre-encoded textual descriptions of the video content. Such descriptions are
subjective and usually very concise. Closed captions, which are encoded into
the video signal, contain a transcription of the dialogues but do not relate to any
visual information. Additionally, no provision is made for events that are
happening in real time, such as a sudden or dramatic event that qualifies as
"breaking news". Such an event is probably not contained in the EPG data.

There exists a need for an improved television channel selection
method, which employs automatic searching in video, based on the audio and
video content of the television channels. There exists also a need for the
method to match the viewer's preferences, specified as a query, with the
content attributes of the television channels which are extracted automatically
and in real-time from these channels.

BRIEF SUMMARY OF THE INVENTION

According to one aspect of the present invention, there is provided a 
method of selecting, at a video receiver location, a desired video program 
channel from a number of program channels transmitting video programs, 
comprising: automatically generating for each of the program channels, 
indexing data of at least one predetermined attribute based on the video 
program content of the respective channel; specifying at least one attribute 
corresponding to a desired program content; and identifying, from the indexing 
data, any program channel having a match with respect to the attribute 
specified. 

According to further features in many of the described preferred 
embodiments, the indexing data is generated at a remote location, is encoded 
and transmitted in a separate indexing channel for all the program channels, 
and is received and decoded at the receiver location; and the at least one 
attribute corresponding to a desired program is specified at the receiver 
location. 

According to still further features in the described preferred 
embodiment of the invention, the indexing data is collected from selected 
key-frames of the respective video program; and is tagged with a channel 
identification code and with a time tag. In addition, the indexing data from a 
plurality of channels is multiplexed into a data stream before transmission. 

According to still further features in the described invention, the
selection of the channel to be viewed may be "event-driven", as an extension of
the prior art "program-driven" methods. Thus, the video material is
segmented into multiple sets of "events", such as occurrences of people,
objects, sounds and others; the occurrence of the events can be detected and
presented to the viewer at the receiver location.

Thus, according to another aspect of the invention, there is provided a
method for indicating at a video receiver location, the occurrence of a particular
event when occurring on any number of program channels transmitting video
programs, comprising: automatically generating at a remote location, for each of
the program channels, indexing data of the respective video programs;
encoding and transmitting said indexing data for all the channels; receiving and
decoding the indexing data; specifying at the receiver location the particular
event when occurring on any of the program channels; and identifying, from
the indexing data, each occurrence of the particular event on any of the program
channels.

According to further features with respect to this aspect of the invention, 
the specified event, when occurring on any of the program channels may be 
automatically displayed (e.g., as a picture within a picture) on the screen of the 
video receiver; and / or, may be recorded in a recorder. 

According to one embodiment of the invention, the indexing data is 
transmitted in a separate indexing channel and is received and decoded at the 
receiver location. In a second described embodiment, the indexing data is used 
at a central control node for selecting programs to be transmitted to a plurality 
of viewer stations at a plurality of receiver locations according to the attribute 
specified at the respective receiver location. 

According to another aspect of the present invention, therefore, there is 
provided a method of selecting, at a plurality of viewer locations, a desired 
video program from a plurality of video programs transmitted in a plurality of 
program channels, comprising: automatically indexing, at a remote location, 
attributes of each of the video programs transmitted in the program channels; 
transmitting said attributes of each of the video programs in the program 
channels; receiving, at a central control node, the video programs and the 
attributes thereof; specifying, at each of the viewer locations, particular 
attributes of a video program desired to be viewed at the respective viewer 
location; and utilizing, at the central control node, the attributes specified at the 
viewer location for identifying the video programs matching the specified 
attributes. 

Thus, the system may include a central control node that receives the
indexed data from the indexing channel and transmits programming over a
network to multiple viewer stations (e.g., homes). The programming may include
standard analog video broadcasts (e.g., NTSC, PAL), digitally encoded video
broadcasts (e.g., MPEG), or digital information related to computer-executed
applications. Each viewer station includes at least one video display set (e.g., a
television receiver) and an interactive station controller which is sometimes
referred to as a set-top box. The interactive station controller at the viewer
station specifies at least one attribute corresponding to a desired program to be
viewed. The central control node then identifies, from the indexed data received
and decoded, the channel representing the best match with respect to the
content-based attribute specified and transmits the best-match channel to the
receiver location.

In one described preferred embodiment, the content that is searched
and detected may be stored in a recording device, enabling future viewing and
the gathering of program and event statistics. In another described preferred
embodiment, the data processor at the remote location generates indexing data
that is stored on a web server on the internet.

According to a still further aspect of the present invention, there is
provided a method of generating a program schedule of desired video program
channels from a number of program channels transmitting video programs of
various program contents, comprising: automatically generating, for each of the
program channels, indexing data of at least one predetermined attribute based
on the content of the programs to be transmitted on the respective channel, and
the scheduled transmission time thereof; specifying at least one attribute
corresponding to a desired program content; and identifying, from the indexing
data, the program channels and the scheduled transmission times thereof,
having a match with respect to the specified attribute to thereby produce a
program schedule of the program channels.

Further features and advantages of the invention will be apparent from 
the description below. 

BRIEF DESCRIPTION OF THE DRAWINGS

The invention is herein described, by way of example only, with 
reference to the accompanying drawings, wherein: 

FIG. 1A is a block diagram illustrating an overall system in accordance
with the present invention;

FIG. 1B illustrates a video channel selection system based on 
automatic searching by content according to one aspect of the present 
invention; 

FIG. 2 illustrates a video channel selection system that includes a
central control node;

FIGS. 3A and 3B depict a partitioning of a video program into 
high-resolution segments, or events; 

FIG. 4 is a flow diagram of preferred steps for selecting a video channel
based on automatic searching by content;

FIG. 5 illustrates a video indexing data multiplexing and encoding
device according to another aspect of the present invention;

FIG. 6 illustrates an automatic video indexer according to another 
aspect of the present invention; 

FIG. 7 illustrates a search menu for defining a user query from
pre-defined attributes of audio and video content;

FIG. 8 illustrates a sequence of video indexing data, which is a 
low-resolution representation of the respective video images; 

FIG. 9 is a flow diagram of preferred steps for searching explosions in
video based on visual content only;

FIG. 10 illustrates a data sequence of measurements characteristic of
explosions;

FIG. 11 is a flow chart of a web-based television channel selection
system based on automatic searching by content;

FIGS. 12A, 12B and 12C illustrate the system used for producing a
personalized program schedule; and

FIG. 13 illustrates the system used for computing topic-oriented video
summaries of television channels according to one aspect of the present
invention.






Patent provided by Sughrue Mion, PLLC - http://www.sughrue.com 






DETAILED DESCRIPTION OF THE PRESENT INVENTION 

This invention presents a method of selecting, at a video receiver
location, a desired video program channel from a number of program channels
transmitting video programs. FIG. 1A presents an overview of the main system
components. Video channels can be inputted from a variety of sources 10,
including live video streams, as well as archived video material. For each of
said program channels, indexing data is generated 20, of at least one
predetermined attribute 25, based on the video program content of the
respective channel. The method entails specifying at least one attribute
corresponding to a desired program content 30. The attribute may be a desired
program title, or an occurrence of an event within any program on any channel,
as will be elaborated on and demonstrated below. Utilizing the indexed
data and user input attributes, a search is conducted to detect a match between
the program channels and the attribute specified 40. Multiple attributes may be
specified, along with a set of relating conjunctions, in which case a match is
detected when all attributes are present, and when the associated conjunctions
are met. Once a match is detected, the related program and program channel are
identified 50. Multiple program channels may be identified as having a match.
The match scores may be sorted, in which case the identified program channels
are identified as the "best match program channel", "second-best match
channel", and so forth. Identified programs are presented to the viewer, e.g., as
a picture-in-picture or as a scrolling listing on the main display screen 60. With
this information in hand, the viewer may select an option such as the viewing of
the identified program content, or the recording of the content 70. The viewer may
select these options in an interactive operational mode, or may predetermine
selected actions. The variations within the presented method for video channel
selection will be discussed as well as possible system embodiments.
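
The matching and sorting steps (40 and 50) can be illustrated with a minimal sketch. It is not taken from the specification: the record layout, the per-attribute detection scores, the 0.5 threshold and the use of the weakest attribute score as the overall match score are all assumptions made for illustration, and the only conjunction modelled is a plain AND.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """One indexing-data entry for a channel (hypothetical layout)."""
    channel: str
    attributes: dict  # attribute name -> detection score in [0, 1] (assumed)


def match_score(record, required, threshold=0.5):
    """Return a score if every required attribute is detected, else None.

    The conjunction is a plain AND: every attribute must reach the
    threshold. The overall score is the weakest attribute score.
    """
    scores = [record.attributes.get(name, 0.0) for name in required]
    if not scores or min(scores) < threshold:
        return None
    return min(scores)


def rank_channels(records, required, threshold=0.5):
    """Return (channel, score) pairs sorted from best match downward."""
    matches = []
    for record in records:
        score = match_score(record, required, threshold)
        if score is not None:
            matches.append((record.channel, score))
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


index = [
    IndexRecord("CH-1", {"explosion": 0.9, "news": 0.8}),
    IndexRecord("CH-2", {"explosion": 0.6, "news": 0.7}),
    IndexRecord("CH-3", {"news": 0.95}),
]
ranking = rank_channels(index, ["explosion", "news"])
```

In this toy data the first entry of `ranking` plays the role of the "best match program channel" and the second that of the "second-best match channel"; CH-3 is excluded because one of the two specified attributes is absent.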

Reference is now made to FIG. 1B, which is a block diagram showing
a first embodiment of the video channel selection system. For purposes of
simplicity and clarity, the system is described with reference to widely available
systems and standards, including conventional analog television receivers and
cable-based video networks. It will be appreciated, however, that the particular
components of the channel selection system may be implemented with a variety
of conventions, standards, or technologies without departing from the
underlying concepts of the present invention. The term "video-channel selection
system" is used to emphasize the applicability of the invention beyond standard
television-based systems. The term "video" is used to describe both an
audio-visual content and the image part of that content which consists of a
sequence of images.

The system illustrated in FIG. 1B comprises two parts: one at the
transmitter side, and the other at the receiver side. Generally, the transmitter
side of the system can be located at the service provider's site, and the video
indexing channels can be encoded and transmitted along with other channels.
The receiver side includes at least one video display set (e.g., a television
receiver) and an interactive station controller which is sometimes referred to as
a set-top box.

The receiver side of the system can be located in a user's set-top
cable converter box or other signal reception or processing device such as a
satellite receiver. Alternatively, the components can be mounted in a separate
housing or included as a part of the television receiver, VCR, personal
computer or multimedia player.

The transmitter side consists of a set of similar processing paths, one
for each video channel. Each such path takes a digital video bit-stream 110,
such as an MPEG2 stream, and decodes the stream in a decoder unit 111, into
a sequence of video images. The video feed for each channel may be a live
program or a recording on tape. The programming may include standard analog
video broadcasts (e.g., NTSC, PAL), digitally encoded video broadcasts (e.g.,
MPEG), or digital information related to computer-executed applications.
Regardless of input format, the bit-stream is converted into a sequence of
images and the associated sound track in order to enable indexing on a wide
range of video attributes.



When the input for a specific television channel consists of a video
signal 113, a video digitizer module 114 converts that signal into a digital
representation of the sequence of images and the associated sound track in a
format suitable for processing by the video indexer modules 115. The decoded
or digitized video signals are encoded for transmission in the video indexer
modules 115. The operation of each video indexer module 115 is described in
detail in FIG. 6. In that described embodiment video indexing is based on
key-frames (a subset of the original video frames), which are used as a
representation for these original video frames. Video content is captured at a
frame rate of 25 frames or 30 frames per second (PAL or NTSC formats,
respectively). Since the visual information content in video changes at slower
rates, only a fraction of the video frames is retained, and the indexing data for
content-attributes are automatically computed for these frames only. Although
video-indexing data is related to key-frames, additional frames may be needed
when computing this data. A specific example is in the case of motion, where
frames adjacent to each key-frame are analyzed to extract motion attributes.
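
The idea of retaining only a fraction of the frames can be illustrated with a small sketch. It is not the selection method of the specification, whose preferred method is described elsewhere; it is a generic difference-threshold technique swapped in for illustration. The flat-list frame representation, the function names and the threshold of 20 gray levels are assumptions.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equal-sized frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def select_key_frames(frames, threshold=20.0):
    """Return indices of frames kept as key-frames.

    The first frame is always kept. A later frame is kept only when it
    differs from the most recent key-frame by more than the threshold,
    so slowly changing content yields few key-frames.
    """
    if not frames:
        return []
    key_indices = [0]
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i], frames[key_indices[-1]]) > threshold:
            key_indices.append(i)
    return key_indices


# Toy 4-pixel grayscale "frames": two near-static shots with a cut at index 3.
frames = [
    [10, 10, 10, 10],
    [11, 10, 10, 12],
    [12, 11, 10, 12],
    [200, 190, 210, 205],
    [201, 190, 209, 205],
]
key_frames = select_key_frames(frames)
```

Here five frames reduce to two key-frames, one per shot. Motion attributes would still require looking at the frames adjacent to each retained index, as the text notes.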

The key-frames and video indexing data of a number of selected 
channels are time multiplexed in multiplexer 116 (FIG. 1B) into at least one 
video index channel which is later processed to aid the user in television 
channel selection. The sparse nature of key-frames, coupled with the concise 
nature of video indexing data, allows multiplexing such frames and indexing 
data from multiple channels into a unified video index channel. All key-frames 
and associated indexing data are tagged with a channel identification code as 
well as a time-tag, so that channel-specific indexing data can be de-multiplexed 
and reconstructed at the receiver's side. 
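
The role of the channel identification code and the time-tag can be sketched as follows. This is a minimal illustration, assuming a simple packet with three fields and a merge by time-tag; the packet layout, field names and time unit are hypothetical and are not specified by the source.

```python
import heapq
from collections import defaultdict
from typing import NamedTuple


class IndexPacket(NamedTuple):
    time_tag: float   # seconds since start of stream (assumed unit)
    channel_id: int   # channel identification code
    payload: str      # stand-in for a key-frame plus its indexing data


def multiplex(per_channel):
    """Merge per-channel packet lists (each sorted by time) into one stream."""
    return list(heapq.merge(*per_channel))


def demultiplex(stream):
    """Rebuild channel-specific packet lists from the unified stream."""
    channels = defaultdict(list)
    for packet in stream:
        channels[packet.channel_id].append(packet)
    return dict(channels)


ch7 = [IndexPacket(0.0, 7, "kf-a"), IndexPacket(4.2, 7, "kf-b")]
ch9 = [IndexPacket(1.5, 9, "kf-c")]
stream = multiplex([ch7, ch9])
restored = demultiplex(stream)
```

Because every packet carries its own tags, the receiver side needs no knowledge of how the packets were interleaved in order to reconstruct each channel's indexing data.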

Video indexing data is preferably prepared at the transmitter side for
the following reasons: the computational capacity of the set-top box or
television receiver is limited; the indexing is done in a user-independent
manner; and the bandwidth does not allow transmitting multiple channels to the
receiver for indexing. Note that the "transmitter side" can be any central
location, such as a cable head-end, as will be discussed below.

Since key-frames and indexing data depend on the content of each
channel, multiplexer 116 is designed to handle several situations that may
occur. One such situation is the occurrence of two or more key-frames at the
same time. In such a situation key-frames are either dropped or re-encoded in a
more concise manner.

FIG. 1B also shows a functional block diagram of the remote control
handset 120 and set-top box controller 130 at the receiver side. The remote
control handset 120 comprises: a query profile selection signal generator 121
that generates query profile selection signals in response to depression of
suitable buttons; an automatic searching signal generator 122; and a regular
channel changing signal generator 123 which generates channel changing
signals in response to depression of the channel changing buttons.

The set-top box controller 130 at the receiver side further includes an
indexing data decoder 131 in communication with the cable company and
connected to a video search engine 132 within controller 130. The latter further
includes a channel selector 133 controlling a secondary channel receiver 134
within controller 130 in communication with the cable company. Controller 130
further includes a primary channel receiver 135, and a user interface 136.

FIG. 2 is a block diagram showing an alternative to the video channel
selection system of FIG. 1B. In FIG. 2, the transmitter 200 and the receivers 210
are separated by a central control node 220 that includes a receiver 221, a server
222, a video search engine 223, and a transmitter 224. The receiver 221 receives
the programs and indexed material from the transmitter 200. The server 222 stores
the indexed material received from the transmitter 200, searches this material for
specified attributes in the video search engine 223, and transmits programming
over a network to the multiple viewer stations 210 (e.g., homes) via cable or
wireless.

Each viewer station 210 includes a receiver 211, at least one video
display set 212 (e.g., a television receiver), and an interactive station controller 213,
sometimes referred to as a set-top box. The server 222 in the central
control node 220 matches the indexed video material to a pre-allocated set of
users. The users' preferences and search criteria are available at the server 222 by
a user-controlled remote server within controller 213, and/or by automatic extraction
of viewer preferences based on viewing-history profiles effected by the controller
213. The interactive station controllers 213 at the viewer stations also enable the
viewers to specify at least one attribute corresponding to a desired program to be
viewed. The central control node 220 then identifies, from the indexed data
received and decoded, the channel representing the best match with respect to the
content-based attribute specified by each user and transmits the best-match
channel program to the receiver location of the respective viewer.

In the two embodiments described (FIG. 1B and FIG. 2), the channel 
selection (identification) process takes place in the set-top box (130) and the 
server in the central-control node (220), respectively. These units have access 
to the attributes coming from the receiver end, as well as the indexed data 
generated from the video channels. These units are then responsible for 
detecting any matches and identifying the respective channels, sorted by their 
match; specifically, identifying the channel representing the best match, and the 
channel of the next-best match. The two embodiments described represent two 
possible system architectures. In the one case, the indexed data is generated at 
a remote location, and transmitted to the receiver location, at which place the 
program identification is pursued. In the second case, the indexed data is 
generated at a remote location, transmitted to a centralized location (a 
central-control node), at which place the program identification is pursued and 
conveyed further to receiver locations. It should be noted that additional 
embodiments are possible, such as the generation of the indexing data locally 
at the server location; the detection of a match and identification of selected 
programs at a server location, which may be physically located with the video 
channel providers; and variations thereof. 

An example is in news production houses that generate the video
material, have indexing data generated in house, and provide video material to
editorial rooms based on predefined attributes or user-specified attributes, all
within a localized architecture.

With the present invention, video material, both pre-recorded
programs as well as live material (e.g., live footage coming in a news program),
is automatically indexed for content. Channel selection is thus enabled based
on live material as well as predefined program categories. Multiple channels are
indexed simultaneously.

The term "content" refers to all visual and auditory information that is
extracted and indexed automatically from the video streams, in addition to any
manual annotation available (e.g., the program title, program category, etc.).
Moreover, video content relates to the partitioning of video streams into
segments that correspond with video "attributes". Attributes may represent
video program titles, as known in prior art, in which case the corresponding
video segments are full-length video program segments. In this invention,
attributes may also represent the occurrence of "events" within a program
segment, thus partitioning the program into higher-resolution segments of
content. Hereon, the terms attributes and events are used interchangeably.

FIG. 3A depicts a partitioning of a video program segment 300 into
"event" segments. Examples of attributes and events include: object (e.g.,
people) events 310-313, sport events (e.g., "goals" 320, 321), and news events
(e.g., "breaking-news" events 330). Examples of other events that may be
specified by a user include: action-movie events (e.g., explosions), sound
events (e.g., President Clinton's voice), spoken-word events (e.g., words about
politics or the economy), and text events (e.g., segments that have material
regarding a certain location and that location is present as overlaid text on top
of the video segment). Events are detected and indexed across several video
channels simultaneously, as shown in FIG. 3B. The time axis is segmented into
events as they occur ("event-driven"). Each event, indicated as a short line
segment, is linked to the related channel/video source.

With this definition of content, the selection of the channel to be
viewed may be "event-driven", as an extension of the prior art "program-driven"
methods. Thus, the system enables searching video channels for
attributes/events in addition to searching for full-length programs. Occurrences
of the specified events are detected and presented to the viewer at the receiver
location.

The user can predefine a table of attributes of interest. Alternatively, a 
viewer's preference list of search attributes, or a program viewing profile, may 
be learned from the user's viewing history. 

An example of a user's attribute table, or a program and event list, is the 
following: 



Category/Attributes    Programs/Events
News                   CNN, NBC, ABC
Sports                 NBA
News                   "breaking news"
Sports                 "Goal"
People                 "Sharon Stone", "Clinton"
Words                  "economy", "disaster"



The user may select a category of interest, such as news or sports;
within each category, the user may define programs of interest such as CNN or
NBC news. The user may choose attributes of interest, such as people or
keywords, and particular events of interest, such as the appearance of "Sharon
Stone" and "Clinton", or any words spoken about the "economy".

Combinations are possible within the selection criteria. The user may 
combine program categories with attribute events, such as selecting to see 
Pres. Clinton only on selected news programs (e.g. CNN); the user may add 
time constraints, such as selecting to be notified of events occurring during all 
evening news channels. 
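
A combined selection criterion of this kind can be sketched as a small data structure. The sketch below is illustrative only: the `Event` and `Criterion` classes, their field names and the 18:00 to 23:00 "evening" window are assumptions and do not appear in the specification.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass
class Event:
    channel: str     # e.g. "CNN"
    category: str    # e.g. "News"
    attribute: str   # e.g. "People"
    value: str       # e.g. "Clinton"
    at: time         # time of day at which the event was detected


@dataclass
class Criterion:
    attribute: str
    value: str
    channels: Optional[set] = None   # None means any channel
    category: Optional[str] = None   # None means any category
    start: Optional[time] = None     # inclusive window start
    end: Optional[time] = None       # inclusive window end

    def matches(self, event):
        """True when the event satisfies every constraint that is set."""
        if (event.attribute, event.value) != (self.attribute, self.value):
            return False
        if self.channels is not None and event.channel not in self.channels:
            return False
        if self.category is not None and event.category != self.category:
            return False
        if self.start is not None and event.at < self.start:
            return False
        if self.end is not None and event.at > self.end:
            return False
        return True


# "Clinton" on CNN news only, during an assumed evening window.
criterion = Criterion("People", "Clinton", channels={"CNN"},
                      category="News", start=time(18, 0), end=time(23, 0))
events = [
    Event("CNN", "News", "People", "Clinton", time(19, 30)),
    Event("NBC", "News", "People", "Clinton", time(19, 30)),
    Event("CNN", "News", "People", "Clinton", time(9, 0)),
]
hits = [e for e in events if criterion.matches(e)]
```

Only the first event survives: the second is on a channel outside the selected set and the third falls outside the time window. A user's attribute table would then be a list of such criteria.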

FIG. 4 is a flow diagram of preferred steps for selecting a television
channel or any video channel based on automatic searching by content.
In step 410, video streams are received from multiple video channels. In step
420, key-frames are selected from each video channel, based on the video
content in a manner that represents the content of the video in a concise and
efficient way; based on the key-frames, additional indexing data, which are
attributes related to the content of the video, are computed. In step 430,
key-frames and indexing data from all indexed channels are encoded and
combined into a much smaller number of indexing streams or files.

Steps 440 to 480 constitute a particular sequence for channel
selection by the viewer according to the present invention. In step 440, the
viewer receives a particular primary video signal at the television receiver or
set-top box controller; the primary signal is usually displayed on the main video
display. In step 450, key-frames and video-indexing data generated as
described above are transmitted to the set-top box controller; alternatively, the
video-indexing data is transmitted to the server at a central node location as
described above with reference to FIG. 2. In step 460, a video query is defined
by manipulating an on-screen menu; alternatively, such a query can be defined
by the user in a more flexible or pre-programmed manner in the user's personal
computer and downloaded to the receiver or set-top box controller or server at
the central control node 220 of FIG. 2. Alternatively, the viewer's preference list of
search attributes is updated based on the viewer's query history or viewing profile,
and used for the current search.

In step 470, the search results (e.g., in the form of key-frames) from
video channels other than the primary channel that match the video query are
displayed on a secondary display (such as a picture-in-picture (PIP)
arrangement). Alternatively, results may be presented as a listing. The user
can, as shown by step 480, select interactively either to switch the primary
channel, or to record a video channel, according to the search results.

A functional description of a multiplexing device for video indexing data,
for use as multiplexer 116 of FIG. 1B, is illustrated in FIG. 5. Video indexing data
510 enters the video index multiplexer from a plurality of channels and is
encoded by the indexing data encoder 520 for each channel. The parallel-to-serial
converter 530 serializes the encoded indexing data streams from the
plurality of channels. The serial index data stream enters a FIFO buffer 540 and
exits to an arbitration logic 550 designed to handle co-occurrence of key-frames
in more than one channel. Then, the encoded indexing data goes through a bit
rate controller 560 to produce the output time-multiplexed video indexing data.

The serialization of a number of data channels, the arbitration of
key-frames, and the control of the data-rate, can be implemented by a number
of prior art methods from the field of communications that are not specific to the
present invention.
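
As a rough illustration of the data path of FIG. 5 (serialization, FIFO buffering, arbitration and rate control), the following Python sketch multiplexes per-channel index records into a single stream. The record layout, the arbitration rule (lower channel number first when time tags coincide) and the per-tick byte budget are assumptions made for illustration only; the specification leaves these details to known communication techniques.

```python
import heapq
from collections import deque
from typing import NamedTuple

class IndexRecord(NamedTuple):
    timestamp: int   # global-clock time tag
    channel: int     # channel identification code
    payload: bytes   # encoded indexing data (hypothetical format)

def serialize(channels):
    """Parallel-to-serial conversion: merge per-channel record lists
    (each already in time order) into one FIFO ordered by
    (timestamp, channel). Ordering by channel number arbitrates
    between records that co-occur in time."""
    return deque(heapq.merge(*channels))

def rate_limit(fifo, bytes_per_tick):
    """Rate control: emit records in FIFO order, grouped into ticks so
    that no tick exceeds bytes_per_tick (an oversized record is sent
    alone in its own tick)."""
    ticks, current, used = [], [], 0
    while fifo:
        rec = fifo.popleft()
        size = len(rec.payload)
        if current and used + size > bytes_per_tick:
            ticks.append(current)
            current, used = [], 0
        current.append(rec)
        used += size
    if current:
        ticks.append(current)
    return ticks
```

Each inner list returned by `rate_limit` stands for what would be sent in one transmission slot of the indexing channel.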

FIG. 6 describes an automatic video indexer that may be used as video
indexer 115 in FIG. 1B to select key-frames and to generate indexing data.

The audio-video data stream is first processed by a key-frame 
selection module 610 to produce a content summary. A number of prior-art 
methods for selecting key-frames are known. Most of them are based on 
detecting video shot transitions and selecting a frame from each shot (generally 
the first one) as a key-frame. In the presence of motion, more key-frames have 
to be selected to represent the content of video including the temporal variation. 
Co-pending Application No. PCT/IL99/00169, by the same assignee as the
present application, describes a preferred method of selecting key-frames. In
most types of video content, it is sufficient to select only a few percent
of the original video frames to get a good representation.
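
The shot-transition approach mentioned above can be sketched as follows. This is not the preferred method of the co-pending application; it is a common histogram-difference heuristic, with the frame representation (a flat list of gray levels), the bin count and the threshold all assumed for illustration.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized gray-level histogram of a frame given as a flat,
    non-empty list of pixel values in the range 0..max_value-1."""
    counts = [0] * bins
    for p in frame:
        counts[p * bins // max_value] += 1
    n = len(frame)
    return [c / n for c in counts]

def select_key_frames(frames, threshold=0.5):
    """Return the indices of key-frames: the first frame of each shot.
    A shot transition is declared when the L1 distance between the
    histograms of consecutive frames exceeds threshold (an assumed,
    tunable value)."""
    if not frames:
        return []
    keys = [0]
    prev = histogram(frames[0])
    for i in range(1, len(frames)):
        cur = histogram(frames[i])
        if sum(abs(a - b) for a, b in zip(prev, cur)) > threshold:
            keys.append(i)
        prev = cur
    return keys
```

A sketch of this kind selects only one frame per shot; representing motion within a shot, as noted above, would require additional key-frames.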

While the summary, which consists of the video key-frames, can be
used as a concise descriptor of the video content, more indexing information
should be extracted to allow for efficient automatic searching. This is due to the
following reasons:

• The key-frames contain raw image data, while video searching is
done based on image attributes.

• Some attributes, such as image motion information, cannot be
extracted accurately from key-frames alone.

• Practical limitations on the computing power inside set-top boxes
require that video search engines inside such boxes operate on
concise indexing data.

Video indexing data is automatically computed from the video image 
sequence by video image indexing engines 620. Such engines may include a 
face detection engine 621, a motion indexing engine 622, a video text
recognition engine 623, and/or a color indexing engine 624.

Audio indexing data is automatically computed from the audio track by 
audio indexing engines 630. Such engines may include: segmentation to 
silence, speech, music and effects 631; feature extraction for audio 
classification 632; and recognition of pre-programmed effects 633. 

Prior art methods are known and may be used for implementing each 
of the above mentioned indexing engines 620 - 633. 

Sometimes video streams carry video meta-data such as closed 
captions, and possibly encoded textual information such as annotations. 
Meta-data decoder 640 extracts this meta-data which is added to content-based 
indexing data. Manual annotations can also be added by annotation editor 650. 
In a live feed situation, the volume of such descriptions is limited due to time 
constraints. However, they provide additional information about the video 
content. 

All video indexing data is time-stamped according to a global clock. 

FIG. 7 illustrates a search menu 700 overlaid on the television display 
by a graphic generator that mixes the graphic video signal with the receiver 
video signal. The search menu consists of a set of content-based attributes 
such as visual attributes 710, audio attributes 720, topic-related attributes 730,
and special attributes 740 such as breaking news or explosions. The search
menu also includes a simple query language 750 that allows selecting "AND",
"OR" and "NOT" control functions, for generating and displaying, in a display
region 760, such queries as:

VISUAL = People AND AUDIO = Laughter

Such a search menu is simple to operate and requires only a minimal
user-interface. However, the indexing data transmitted to the viewer can support
a wide range of video queries. For that purpose, a computer-based interface
can be used to define a set of queries on the viewer's personal computer and to
download the set of queries to the set-top box. Once downloaded, these queries
can be selected by the remote controller handset 120 (FIG. 1B). In other
TV-PC combinations, the query definition is supported more easily.
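
For illustration, a query of the form shown in display region 760 could be evaluated against one channel's indexing data roughly as follows. The textual syntax, the operator precedence (NOT, then AND, then OR), the restriction to single-word values and the dictionary layout of the attributes are all assumptions; the specification does not define a grammar.

```python
def evaluate(query, attributes):
    """Evaluate a query such as 'VISUAL = People AND NOT AUDIO = Music'
    against one channel's attributes (dict: category -> set of values).
    Assumed precedence: NOT binds tightest, then AND, then OR."""
    tokens = query.split()
    pos = 0

    def term():
        nonlocal pos
        if tokens[pos] == "NOT":
            pos += 1
            return not term()
        category, eq, value = tokens[pos:pos + 3]
        if eq != "=":
            raise ValueError(f"expected '=' after {category!r}")
        pos += 3
        return value in attributes.get(category, set())

    def conjunction():
        nonlocal pos
        result = term()
        while pos < len(tokens) and tokens[pos] == "AND":
            pos += 1
            rhs = term()  # always parse the right-hand side
            result = result and rhs
        return result

    def disjunction():
        nonlocal pos
        result = conjunction()
        while pos < len(tokens) and tokens[pos] == "OR":
            pos += 1
            rhs = conjunction()
            result = result or rhs
        return result

    value = disjunction()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return value
```

A set-top box could run such an evaluator over the latest time-stamped attributes of every indexed channel and report the channels for which it returns true.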

As mentioned above, the method for computing attribute-specific 
indexing data and for querying these attributes can be implemented by methods 
known in the art. For illustrative purposes, a simple example is described below 
teaching how to search for explosions based on the video image track. 

The search is implemented, as described in FIG. 9, by a combination
of indexing and searching. Indexing 910 consists of decimating the key-frame
sequence, that is, computing a low-resolution version of the images.
Low-resolution color representations support a wide variety of color-based 
queries. 

FIG. 8 shows a low-resolution frame sequence obtained by decimating 
the key-frame sequence. 

Searching for explosions in the indexing data is performed by 
computing the "fire magnitude" 920 at each frame. The fire magnitude value is 
computed by summing a quantity inversely related to the color distance from a 
pre-specified fire color value, over all pixels in the low-resolution image as 
shown by blocks 930 and 940 in FIG. 9. FIG. 10 shows the fire magnitude for 
the sequence of FIG. 8. The concise one-dimensional fire-magnitude sequence 
is processed by a derivative and threshold logic 950 to decide on a candidate 
explosion event. 
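
The fire-magnitude search described above can be sketched as follows. The particular fire color, the inverse function 1/(1 + d/scale), the scale constant and the threshold are assumptions chosen for illustration; the specification only requires a quantity inversely related to the color distance, followed by derivative and threshold logic.

```python
def fire_magnitude(image, fire_color=(255, 96, 0), scale=100.0):
    """Sum, over all pixels of a low-resolution RGB image (a list of
    (r, g, b) tuples), a quantity that decreases with the Euclidean
    color distance d from fire_color. The color, the 1/(1 + d/scale)
    form and the scale are assumed values."""
    total = 0.0
    for r, g, b in image:
        d = ((r - fire_color[0]) ** 2
             + (g - fire_color[1]) ** 2
             + (b - fire_color[2]) ** 2) ** 0.5
        total += 1.0 / (1.0 + d / scale)
    return total

def detect_explosions(magnitudes, threshold):
    """Derivative-and-threshold logic: flag frame i as a candidate
    explosion when the fire magnitude rises by more than threshold
    from frame i-1 to frame i."""
    return [i for i in range(1, len(magnitudes))
            if magnitudes[i] - magnitudes[i - 1] > threshold]
```

Because only the one-dimensional magnitude sequence is examined at search time, the per-frame cost at the receiver is small, which is consistent with the limited computing power assumed for set-top boxes.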

The present invention can be implemented in additional embodiments 
other than those described above in which the multi-channel video indexing 
data are transmitted to the set-top box controller (FIG. 1B), or to a central 
control node (FIG. 2) that conducts the searching. 

Thus, the automatic television channel selection may be implemented
over the Internet as illustrated in FIG. 11, wherein a set of video channels are
indexed by content by the combination of a digital video decoder 1110 and a
video indexer 1120 for each of the indexed channels. The index data is stored
in a web server 1130 on the Internet 1131. The web server uses an Internet
TCP/IP protocol to make the indexing data available to users at the PC and TV
combinations 1140, 1150, 1160 and 1170.

Because of the growing overlap between the TV, the personal computer and the 
Internet, more configurations now support data integration between these elements. 
Several such configurations are illustrated in FIG. 11. These include: the 
configuration 1140 of TV viewing on a personal computer 1141 that has at least 
one TV tuner 1142 but is also connected to the Internet; the configuration 1150 of
Internet-capable set-top boxes 1151 with a video-search engine 1152, and at least
one TV tuner; the configuration 1160 of a computer connected directly to the
television receiver; and the configuration 1170 of a personal computer only.
Streaming video via an Internet connection, such as a broadband connection, is a 
replacement for a television signal connection. 

Video streaming over the net is becoming more and more a reality, for example in 
"broadcast.com". Users of such sites are presented with listings of broadcast 
(video) material from multiple broadcast (video) channels. Video may be viewed 
and downloaded. With the present invention, users of such sites will be able to 
benefit from "event-driven" information, with program content provided by
content-providers sorted according to viewer's preference lists, and all other
additional characteristics that are described herein. In FIG. 11, the broadcast
content is streamed via an Internet broadband connection. The user can select a
user profile, formulate a query or provide a username for selection of previously
defined search criteria. In a preferred embodiment, search results in the form of a
listing of the currently available channels that meet the user-defined criteria, or a
thumbnail presentation of the content of these channels in the form of updating
key-frames, can be formatted by the web server as an HTML page and sent to the user.
When the user clicks on a list item or a channel key-frame window, the selected
video stream is buffered and played on the user terminal.

The identification of a channel meeting user-defined criteria is a process
of finding a match between the attributes in a user-input query, or in a predefined
attribute listing, and the indexed data. In the case of multiple attributes and
conjunctions, a match score may be given in reference to the number of attributes
and conjunctions met (for example, the number of elements present). Alternatively,
an all-or-nothing scheme may be used, in which a match is declared only when all
attributes and conjunctions are met. Multiple matches correspond to multiple
channels. Prioritizing between the channels may be introduced by utilizing the
relative match scores; alternatively, the viewer may set relative weights for the set of
user-defined attributes; alternatively, video programs may be prioritized utilizing a
viewer program preference table and history profile.
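
The scoring and prioritizing schemes described above can be sketched as follows. For simplicity the sketch treats a query as a flat list of attributes and ignores conjunction structure; the data layout and function names are hypothetical.

```python
def match_score(query_attrs, channel_attrs, weights=None):
    """Weighted count of queried attributes present in a channel's
    indexing data. Without weights every attribute counts 1, so the
    score is simply the number of attributes met."""
    weights = weights or {}
    return sum(weights.get(a, 1.0) for a in query_attrs if a in channel_attrs)

def rank_channels(query_attrs, index, weights=None, all_or_nothing=False):
    """Return channel names ordered by descending match score.
    index maps channel name -> set of currently indexed attributes.
    In all-or-nothing mode only channels meeting every attribute match."""
    scored = []
    for name, attrs in index.items():
        if all_or_nothing and not set(query_attrs) <= attrs:
            continue
        score = match_score(query_attrs, attrs, weights)
        if score > 0:
            scored.append((score, name))
    # Highest score first; ties broken by name for a repeatable order.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scored]
```

Viewer-set relative weights enter through the `weights` argument; a preference table or history profile could be folded in the same way, as an additional term in the sort key.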

Once a video channel is automatically identified as containing the 
category and attribute of interest, the information may be conveyed to the 
viewer at the receiver location in several ways. In one embodiment the program 
(or a keyframe program representation) may be displayed in a secondary video 
display window as a picture-in-picture (PIP). A variety of PIP settings are known 
in the art. These may include partitioning of the main video display window into 
several smaller windows surrounding the main display, or a secondary display 
window as a small window top right, and others. In another embodiment, 
program identification is displayed as a listing on part of the display window. 
Program identification may include the program title (e.g., "CNN"), any defining
attributes ("breaking news" segment) and any additional information desired
(e.g., in a pay-per-view channel: "selecting this channel will cost X per hour").
In yet another embodiment, a signal may be used (such as a blinking signal or a 
sound signal) to notify of an event. 

Following a channel identification event, the user at the receiver
location may select, via the controller 213 of FIG. 2, one of several action items,
including a viewing option and a recording option. In the viewing option, the
user may select to switch over to a full-screen view of the selected channel. In
the recording option, the user may select to record the content. The viewer may
choose to select amongst these options interactively, following the notification
of an event; alternatively, the viewer may decide on a fixed setting (e.g., record
all events occurring during the 6-8pm time slot). For example, while a viewer is
watching the sports channel, the viewer is notified in a picture-in-picture setting
that there is a "breaking news" segment in the CNN channel, whereupon the
viewer may decide to select the CNN channel for viewing; alternatively, the
"breaking news" may be automatically recorded.

A variety of techniques exist in the prior art that enable channel 
selection based on specific programs or categories, as selected by the viewer. 
Interactive television systems establish a database of viewer preferences based 
on particular characteristics previously delivered to the viewer. The system 
compares the viewer preference list to the video programming available at the 
selected time, and identifies the video programming which has the greatest 
degree of correlation. The main goal is to generate a personalized channel 
guide based on program preferences and times personalized for the individual 
viewer. An example of such a guide is shown in the following table:

          8:00-8:30            8:30-9:00    9:00-9:30    9:30-10:00
Sports    Soccer Game                       NBA
Drama     Gone with the Wind
News      CNN                  NBC          ABC

In the present invention, an "event-driven" electronic program guide 
(EPG) can be generated. A block diagram for generating the event-driven EPG 
listing is shown in FIG. 12-A. Video indexing data 1210 (generated based on a
set of predefined attributes) and user-specified attributes 1211, 1212 are
processed by a video search engine 1213, and the resultant program/events
schedule is displayed in a listing 1214. The listing may be a scrolling listing,
allowing the viewer to follow, in real time, the video content from a number of
channels, such that the content matches specific topics of interest. The listing
may also include future scheduled programs and events, in which case the listing
includes programs and events, channel identification and time segments (such as
the beginning and the end time of the segment of interest). Schedules of
non-real-time events are generated for any available non-real-time video
material. An example of an event-based EPG real-time scrolling listing
is shown in FIG. 12-B. At each time in which there is one or more identified 
matches, the identified events are listed along with the corresponding video 
channel identifications. The continuous scrolling listing of the events may be 
presented on a small portion of the main display screen, or as a buffered 
secondary screen. The viewer may be interested in future scheduling. A variety 
of scheduling screens are available to the viewer, as shown in FIG. 12-C. The 
display screens include a display that is channel based (for each channel, time 
schedule and attributes are listed), or a display that is time schedule based 
(time segment is listed, together with the attribute/event and corresponding 
program identification). 
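
A minimal sketch of how an event-driven listing and its two display forms might be derived from time-stamped indexing data is given below. The event record fields and the "HH:MM" time format are assumptions for illustration and are not taken from FIG. 12.

```python
from typing import NamedTuple

class Event(NamedTuple):
    start: str      # "HH:MM", derived from the global-clock time tag
    end: str
    channel: str
    attribute: str

def event_epg(events, wanted):
    """Time-schedule-based listing: keep the indexed events whose
    attribute the viewer asked for, ordered by start time and then
    by channel."""
    hits = [e for e in events if e.attribute in wanted]
    return sorted(hits, key=lambda e: (e.start, e.channel))

def by_channel(listing):
    """Channel-based view of the same listing: channel -> its events."""
    view = {}
    for e in listing:
        view.setdefault(e.channel, []).append(e)
    return view
```

Passing a viewer's own attribute set as `wanted` yields a personalized guide, while passing a provider-defined or pooled set yields a global one.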

The event-driven electronic program guide may be generated as a 
personalized guide, personalized to the particular viewer attributes 1212. The 
event-driven electronic program guide may be generated as a global listing, 
containing all attributes, as chosen by the respective video channel providers, 
or predefined in the centralized control server, or as combined across multiple 
user preferences 1211.

Referring to the two embodiments of FIG. 1B and FIG. 2, the indexing
data for each of the program channels could be generated in the video indexing
modules 115 or in the central control node 220, respectively. The video search
engine and channel selection units at the set-top box 130 or, alternatively, the
server at the central control node 220, identify from the indexing data the
program channels to transmit programs, and the scheduled transmission times
thereof, having a match with respect to the specified one or more attributes, to
thereby produce a program schedule. The personalized attributes may be input
via the remote control handset 121 or user interface in the set-top box 136, or
alternatively, the controller 213 at each viewer station 210 could be used for
specifying one or more attributes corresponding to desired program content
data. The global set of attributes by which a schedule is generated may be
derived at the server 222 in the central control node 220. The global set can be
extracted from a set of attributes as chosen by the video content providers, or
by collecting attributes from a set of viewers, or by utilizing a history profiling of
the viewers, or via some combination logic of the personalized attributes lists.

Generating a viewing history profile for a viewer may include storing a
viewer preference database of programs the viewer selects or receives, as known
in the prior art. In this invention, attributes and events are incorporated in the
profiling. The handset 120 in FIG. 1B, or the controller 213 in FIG. 2 at the
receiver location, may generate and store a viewing-history profile of the
programs viewed and the attributes requested at the respective receiver
location. Such a history profile may be utilized for prioritizing the programs
identified, e.g., in the server 222 of the central control node 220, as having the
attributes specified by the respective user at the receiver location.
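
One simple way to picture a history profile that incorporates attributes is a running count of the attributes requested or viewed at a receiver, used to order newly identified programs. The class below is a hypothetical sketch under that assumption, not a description of the handset or controller implementation.

```python
from collections import Counter

class HistoryProfile:
    """Counts how often each attribute was requested or viewed at one
    receiver location (an assumed, simplified form of profile)."""

    def __init__(self):
        self.counts = Counter()

    def record(self, attributes):
        """Add the attributes of one viewed program or one query."""
        self.counts.update(attributes)

    def prioritize(self, candidates):
        """Order candidate programs (name -> set of attributes) so that
        those whose attributes appear most often in the history come
        first; ties are broken by name."""
        def affinity(item):
            name, attrs = item
            return (-sum(self.counts[a] for a in attrs), name)
        return [name for name, _ in sorted(candidates.items(), key=affinity)]
```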

In recent years, smart TV, or "time-shifted" TV, systems have been developed
which, via recording systems that use hard-drives and computer chips, record
and store television programming so that people can watch whatever they want,
whenever they wish. With such recording devices, the viewer at home may
choose between viewing a selected channel and recording it. The present
invention may also be used in such systems to record content-based events
from a plurality of channels according to designated events or events based on
history-profile preferences.

Yet another application of the invention is to generate a report of
program and event statistics. A professional user may be interested in gathering
appearance statistics for a particular logo of interest or a particular clip, for
example. In this scenario, the event of interest is the logo (clip). This event is
automatically indexed at the transmitter side and the indexed channel is
received by the receiver (FIG. 1B) or by the server at the central control node
(FIG. 2). Any indexed event is recorded in the recording device (as known in
the prior art). At the end of a specified time period (e.g., several hours, one day,
night shows, etc.), statistics may be gathered. Examples are: number of
occurrences per channel, overall time allocated in all segments, and so forth.
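
The two example statistics named above can be computed from the recorded occurrences roughly as follows; the tuple layout of an occurrence (channel, start and end in seconds) is an assumption made for the sketch.

```python
from collections import defaultdict

def event_statistics(occurrences):
    """Summarize the recorded occurrences of one event of interest.
    occurrences: iterable of (channel, start_seconds, end_seconds).
    Returns (number of occurrences per channel, total seconds
    allocated across all segments)."""
    per_channel = defaultdict(int)
    total_time = 0
    for channel, start, end in occurrences:
        per_channel[channel] += 1
        total_time += end - start
    return dict(per_channel), total_time
```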

FIG. 13 is a flow diagram illustrating a method for generating 
topic-oriented video summaries (block 1310). In this method, video key-frames 
and video indexing data are processed by a video search engine 1311, and the
query results are arranged in a storyboard, multi-frame display 1312. This 
display allows the viewer to follow, in real time, the video content from a number 
of channels, such that the content matches specific topics of interest. 

The topic summary engine 1310 is similar in implementation to the 
channel selection method taught by the present invention and includes a query 
processing module 1313 communicating with a query definition user interface 
1314. However, the purpose of the system in FIG. 13 is to allow topic-oriented 
multi-channel browsing rather than to select a specific channel. 

The present invention can thus be applied to various arrangements 
where the user or viewer can select between multiple video or multimedia
programs. Such arrangements include broadcasting, webcasting and other 
internet-television implementations, telecasting, video on demand, near video 
on demand, and interactive television. 

While the invention has been described with respect to certain 
preferred embodiments, it will be appreciated that these are set forth merely for 
purposes of example, and that many other variations, modifications and 
applications of the invention may be made.

What is claimed is:

1. A method of selecting, at a video receiver location, a desired video
program channel from a number of program channels transmitting video
programs, comprising:

automatically generating, for each of said program channels, indexing
data of at least one predetermined attribute based on the video program content
of the respective channel;

specifying at least one attribute corresponding to a desired program
content;

and identifying, from said indexing data, any program channel having a
match with respect to the attribute specified.

2. The method according to claim 1, wherein said indexing data is
generated at a remote location, is encoded and transmitted in a separate
indexing channel for all said program channels, and is received and decoded at
said receiver location; and wherein said at least one attribute corresponding to
a desired program is specified at said receiver location.

3. The method according to claim 2, wherein said indexing data from 
a plurality of channels is multiplexed into a data stream before transmission. 

4. The method according to claim 1, wherein said indexing data is
generated from selected key-frames of the respective video program.

5. The method according to claim 1, wherein said indexing data is
tagged with a channel identification code and with a time tag.

6. The method according to claim 1, wherein said generated indexing
data includes both image and audio attributes.

7. The method according to claim 1, wherein one program channel is
identified having the best match with respect to the attribute specified, and at
least one additional program channel is identified having the next-best match
with respect to the specified attribute.

8. The method according to claim 1, wherein, in order to assist
specifying at said receiver location said at least one attribute corresponding to a
desired program content, there is displayed, at said receiver location, a search
menu setting forth a plurality of different attributes selectable by the user.

9. The method according to claim 8, wherein there are also displayed 
"AND", "OR" and "NOT" control functions also selectable by the user. 

10. The method according to claim 1, wherein at least one of said 
attributes is the occurrence of an explosion in the video program, said indexing 
data including characteristic data indicative of fire-like color distribution in 
selected frames of the respective video program. 

11. The method according to claim 1, wherein the generated indexing 
data is stored in a web server in the internet. 

12. The method according to claim 1, wherein the program channel 
best matching the specified attribute is displayed as a picture within a picture on 
the screen of the video receiver. 

13. The method according to claim 1, wherein the identification of the
program channel matching the specified attribute is displayed on the screen of
the video receiver.

14. The method according to claim 1, wherein the program channel
matching the specified attribute is recorded.

15. The method according to claim 1, wherein a viewer at the receiver 
location preselects, via a user interface, whether a program channel identified 
as having a match with a specified attribute is to be recorded or to be 
immediately displayed on a video receiver at the receiver location. 

16. The method according to claim 1, wherein said at least one 
predetermined attribute is used for generating a program guide setting forth a 
program schedule of channels to contain a video program content based on 
said at least one predetermined attribute. 

17. The method according to claim 1, wherein said at least one
specified attribute includes a particular event desired to be identified if
occurring on any of said program channels.

18. The method according to claim 1, wherein said indexing data is
transmitted in an indexing channel and is received and decoded at said receiver
location.

19. The method according to claim 1, wherein said indexing data is
used at a central control node for selecting programs to be transmitted to a
plurality of viewer stations at a plurality of receiver locations according to the
attribute specified at the respective receiver locations.

20. The method according to claim 19, wherein said indexing data is
transmitted in an indexing channel to said central control node.

21. The method according to claim 19, wherein said central control 
node also uses said indexing data together with a history-profile of at least 
some of said viewer stations, for listing the programs to be transmitted to the 
respective viewer stations. 

22. The method according to claim 1, wherein the video receiver 
location generates and stores a history-profile of programs viewed at said video 
receiver location, and utilizes said history-profile for prioritizing the identified 
programs having the attribute specified. 

23. The method according to claim 1, wherein the video receiver 
location or the server at a central control node, generates a statistical report of 
occurrences of said specified attribute. 

24. The method according to claim 1, wherein the specified attribute
relates to a particular topic of interest, and said video receiver location or the 
server at the central control node, generates a summary of occurrences of said 
topic of interest in the video channels. 

25. A method for indicating at a video receiver location, the occurrence 
of a particular event when occurring on any of a number of program channels 
transmitting video programs, comprising: 

automatically generating at a remote location, for each of said program 
channels, indexing data of the respective video programs; 

encoding and transmitting said indexing data for all said channels;

receiving and decoding said indexing data;

specifying at said receiver location said particular event; and

identifying from said indexing data each occurrence of the particular
event on any of the program channels.

26. The method according to claim 25, wherein said indexing data is
transmitted in a separate indexing channel and is received and decoded at said 
receiver location. 

27. The method according to claim 25, wherein said indexing data is 
used at a central control node for selecting programs to be transmitted to a 
plurality of viewer stations at a plurality of receiver locations according to the 
attribute specified at the respective receiver location. 

28. A method of selecting, at a plurality of viewer locations, a desired 
video program from a plurality of video programs transmitted in a plurality of 
program channels, comprising: 

automatically indexing, at a remote location, attributes of each of said 
video programs transmitted in said program channels; 

transmitting said attributes of each of said video programs in said
program channels;

receiving, at a central control node, the video programs and the
attributes thereof;

specifying, at each of said viewer locations, particular attributes of a 
video program desired to be viewed at the respective viewer location; and 

utilizing, at said central control node, said attributes specified at said 
viewer location for identifying the video programs matching said specified 
attributes. 

29. The method according to claim 28, wherein said attributes of each
of said video programs in said program channels are transmitted in a separate
indexing channel.

30. The method according to claim 29, wherein said central control
node also uses said attributes transmitted in said indexing channel, together
with a history-profile of at least some of said viewer stations, for listing the
programs to be transmitted to the respective viewer station.

31. A method of generating a program schedule of desired video
program channels from a number of program channels transmitting video 
programs of various program contents, comprising: 

automatically generating, for each of said program channels, indexing 
data of at least one predetermined attribute based on the content of the 
programs to be transmitted on the respective channel, and the scheduled 
transmission time thereof; 

specifying at least one attribute corresponding to a desired program
content;

and identifying, from said indexing data, the program channels and the 
scheduled transmission times thereof, having a match with respect to the 
specified attribute to thereby produce a program schedule of said program 
channels. 

32. The method according to claim 31, wherein said indexing data is 
generated at a remote location, and is encoded and transmitted in a separate 
indexing channel for all said program channels. 

33. The method according to claim 31, wherein said indexing data is 
received, decoded, and utilized at a receiver location for identifying the program 
channels to transmit programs having a match with respect to the specified 
attribute, and the scheduled transmission times thereof. 

34. The method according to claim 31, wherein said indexing data is 
received, decoded, and utilized at a central control node for identifying the 
program channels to transmit programs having a match with respect to the 
specified attribute, and the scheduled transmission times thereof. 

35. The method according to claim 31, wherein a plurality of attributes 
are specified each corresponding to a desired program content, and each 
program having a match with a specified attribute is identified and included, 
together with its scheduled transmission time, in said program schedule. 

36. The method according to claim 1, wherein said video receiver is a
computer connected to the Internet.